Abstract. Algebraic effect handlers are a recently popular approach for
modelling side-effects that separates the syntax and semantics of effectful 
operations. The shape of syntax is captured by functors, and free monads 
over these functors denote syntax trees. The semantics is captured by 
algebras, and effect handlers pass these over the syntax trees to interpret 
them into a semantic domain. 

This approach is inherently modular: different functors can be composed 
to make trees with richer structure. Such trees are interpreted by applying 
several handlers in sequence, each removing the syntactic constructs it 
recognizes. Unfortunately, the construction and traversal of intermediate
trees is painfully inefficient and has hindered the adoption of the
handler approach. 

This paper explains how a sequence of handlers can be fused into one, 
so that multiple tree traversals can be reduced to a single one and no 
intermediate trees need to be allocated. At the heart of this optimization 
is keeping the notion of a free monad abstract, thus enabling a change 
of representation that opens up the possibility of fusion. We demonstrate
how the ensuing code can be inlined at compile time to produce
efficient handlers. 


1 Introduction 

Free monads are currently receiving a lot of attention. They are at the heart of 
algebraic effect handlers, a new purely functional approach for modelling side 
effects introduced by Plotkin and Power. Much of their appeal stems from
the separation of the syntax and semantics of effectful operations. This is both 
conceptually simple and flexible, as multiple different semantics can be provided 
for the same syntax. 

The syntax of the primitive side-effect operations is captured in a signature 
functor. The free monad over this functor assembles the syntax for the individual 
operations into an abstract syntax tree for an effectful program. The semantics 
of the individual operations is captured in an algebra, and an effect handler folds 
the algebra over the syntax tree of the program to interpret it into a semantic 
domain. 

A particular strength of the approach is its inherent modularity. Different 
signature functors can be composed to make trees with richer structure. Such 
trees are interpreted by applying several handlers in sequence, each removing the
syntactic constructs it recognizes. 

Unfortunately, the construction and traversal of intermediate trees is rather 
costly. This inefficiency is perceived as a serious weakness of effect handlers, 
especially when compared to the traditional approach of composing effects with 
monad transformers. While several authors address the cost of constructing 
syntax trees with free monads, efficiently applying multiple handlers in sequence 
has received very little attention. As far as we know, only Kammar et al. [5] 
provide an efficient implementation. Unfortunately, this implementation does not 
come with an explanation. Hence it is underappreciated and ill-understood. 

In this paper we close the gap and explain how a sequence of algebraic effect 
handlers can be effectively fused into a single handler. Central to the paper 
are the many interpretations of the word free. Interpreting free monads as the
initial objects of the more general term algebras and, in particular, term monads
provides an essential change of perspective where free theorems enable fusion. The
codensity monad facilitates the way, turning any term algebra into a monadic
one for free, and with an appropriate code setup in Haskell the GHC compiler
takes care of fusion at virtually no cost to the programmer (yes, almost for
free!). The result is an effective implementation that compares well to monad
transformers.


2 Algebraic Effect Handlers 

The idea of the algebraic effect handlers approach is to consider the free monad 
over a particular functor as an abstract syntax tree (AST) for an effectful 
computation. The functor is used to generate the nodes of a free structure whose 
leaves correspond to variables. This can be defined as an inductive datatype 
Free f for a given functor f.

data Free f a where
  Var :: a -> Free f a
  Con :: f (Free f a) -> Free f a

The nodes are constructed by Con, and the variables are given by Var.

Since a value of type Free f a is an inductive structure, we can define a fold
for it by providing a function gen that deals with generation of values from Var x,
and an algebra alg that is used to recursively collapse an operation Con op.

fold :: Functor f => (f b -> b) -> (a -> b) -> (Free f a -> b)
fold alg gen (Var x) = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

Algebraic effect handlers give a semantics to the syntax tree: one way of doing 
this is by using a fold. 



The behaviour of folds when composed with other functions is described by 
fusion laws. The first law describes how certain functions that are precomposed 
with a fold can be incorporated into a new fold: 

fold alg gen . fmap h = fold alg (gen . h)   (1)

The second law shows how certain functions that are postcomposed with a fold
can be incorporated into a new fold:

k . fold alg gen = fold alg' (k . gen)   (2)

This is subject to the condition that k . alg = alg' . fmap k.
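
As a concrete sanity check of the two laws, the following sketch instantiates them on a small tree. It is only an illustration under stated assumptions: the Branch signature, the sample tree and the names law1Lhs, law1Rhs, law2Lhs and law2Rhs are hypothetical and do not come from the paper, and Free is written as an ordinary datatype. For law (2) we take k = sum, alg = algList and alg' = algSum; the side condition holds because sum (l ++ r) = sum l + sum r.

```haskell
-- Free monad and its fold, as defined in Section 2.
data Free f a = Var a | Con (f (Free f a))

fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

instance Functor f => Functor (Free f) where
  fmap h = fold Con (Var . h)

-- Hypothetical binary-branching signature, used only for this illustration.
data Branch k = Branch k k

instance Functor Branch where
  fmap h (Branch l r) = Branch (h l) (h r)

-- A sample tree with leaves 1, 2 and 3.
tree :: Free Branch Int
tree = Con (Branch (Var 1) (Con (Branch (Var 2) (Var 3))))

algList :: Branch [a] -> [a]
algList (Branch l r) = l ++ r

genList :: a -> [a]
genList x = [x]

algSum :: Branch Int -> Int
algSum (Branch l r) = l + r

-- Law (1): fold alg gen . fmap h = fold alg (gen . h), with h = (* 2).
law1Lhs, law1Rhs :: Free Branch Int -> [Int]
law1Lhs = fold algList genList . fmap (* 2)
law1Rhs = fold algList (genList . (* 2))

-- Law (2): k . fold alg gen = fold alg' (k . gen), with k = sum.
law2Lhs, law2Rhs :: Free Branch Int -> Int
law2Lhs = sum . fold algList genList
law2Rhs = fold algSum (sum . genList)
```

On the sample tree both sides of law (1) evaluate to [2, 4, 6] and both sides of law (2) evaluate to 6; the right-hand sides get there without building the intermediate tree or list.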

The monad instance of the free monad is witnessed by the following:

instance Functor f => Monad (Free f) where
  return x = Var x
  m >>= f = fold Con f m

Variables are the way of providing a return for the monad, and extending a
syntax tree by means of a function f corresponds to applying that function to
the variables found at the leaves of the tree.

2.1 Nondeterminism

A functor supplies the abstract syntax for the primitive effectful operations in 
the free monad. For instance, the Nondet functor provides the Or k k syntax for 
a binary nondeterministic choice primitive. The parameter to the constructor of 
type k marks the recursive site of syntax, which indicates where the continuation 
is after this syntactic fragment has been evaluated. 

data Nondet k where
  Or :: k -> k -> Nondet k
instance Functor Nondet where
  fmap f (Or x y) = Or (f x) (f y)

This allows us to express the syntax tree of a computation that nondeterministically
returns True or False.

coin :: Free Nondet Bool 

coin = Con (Or (Var True) (Var False)) 

The syntax is complemented by semantics in the form of effect handlers— 
functions that replace the syntax by values from a semantic domain. Using a fold 
for the free monad is a natural way of expressing such functions. For instance, 
here is a handler that interprets Nondet in terms of lists of possible outcomes: 

handleNondetList :: Free Nondet a -> [a]
handleNondetList = fold algNondetList genNondetList

where algNondetList is the Nondet-algebra that interprets terms constructed by Or
operations and genNondetList interprets variables (the suffix List marks the
variants whose semantic domain is lists).

algNondetList :: Nondet [a] -> [a]
algNondetList (Or l1 l2) = l1 ++ l2
genNondetList :: a -> [a]
genNondetList x = [x]

The variables of the syntax tree are turned into singleton lists by genNondetList, and
choices between alternatives are put together by algNondetList, which appends lists.
As an example, we can interpret the coin program:

> handleNondetList coin
[True, False]

This particular interpretation gives us a list of the possible outcomes. 
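
The definitions so far can be collected into a small self-contained sketch. A few choices here are assumptions rather than part of the paper's text: Free is written as an ordinary datatype, the list-valued handler is given the ASCII name handleNondetList, the Functor and Applicative instances are added because current GHC versions require them as superclasses of Monad, and the twoCoins program is a hypothetical extra example.

```haskell
-- Free monad over a signature functor f, with its fold.
data Free f a = Var a | Con (f (Free f a))

fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

-- Superclass instances required by modern GHC (not shown in the paper).
instance Functor f => Functor (Free f) where
  fmap h = fold Con (Var . h)

instance Functor f => Applicative (Free f) where
  pure = Var
  mf <*> mx = fold Con (\h -> fmap h mx) mf

-- Bind grafts the continuation onto the leaves of the tree.
instance Functor f => Monad (Free f) where
  m >>= f = fold Con f m

-- Signature of binary nondeterministic choice.
data Nondet k = Or k k

instance Functor Nondet where
  fmap f (Or x y) = Or (f x) (f y)

coin :: Free Nondet Bool
coin = Con (Or (Var True) (Var False))

-- Handler into lists of possible outcomes.
handleNondetList :: Free Nondet a -> [a]
handleNondetList = fold algNondetList genNondetList
  where
    algNondetList (Or l1 l2) = l1 ++ l2
    genNondetList x = [x]

-- Hypothetical extra program: two independent coin flips.
twoCoins :: Free Nondet (Bool, Bool)
twoCoins = do { x <- coin; y <- coin; return (x, y) }
```

Running handleNondetList twoCoins enumerates the four combinations in left-to-right order, starting with (True, True).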

Generalizing away from the details, handlers are usually presented in the form 

hdl :: forall a . Free F a -> H a

where F and H are arbitrary functors determined by the handler. 

2.2 Handler Composition 

There are many useful scenarios that involve the (function) composition of effect 
handlers. We now consider two classes of this kind of composition. 


Effect Composition A first important class of scenarios is where multiple 
effects are combined in the same program. To this end we compose signatures 
and handlers. 

The coproduct functor f + g makes it easy to compose functors: 

data (+) f g a where
  Inl :: f a -> (f + g) a
  Inr :: g a -> (f + g) a

instance (Functor f, Functor g) => Functor (f + g) where
  fmap f (Inl s) = Inl (fmap f s)
  fmap f (Inr s) = Inr (fmap f s)

The free monad of a coproduct functor is a tree where each node can be built
from syntax from either f or g.

Composing handlers is easy too: if the handlers are written in a compositional 
style, then function composition does the trick. A compositional handler for the 
functor F has a signature of the form: 


hdl :: forall g a . Free (F + g) a -> H1 (Free g (G1 a))

This processes only the F-nodes in the AST and leaves the g-nodes as they
are. Hence the result of the compositional handler is a new (typically smaller)
AST with only g-nodes. The variables of type G1 a in the resulting AST are
derived from the variables of type a in the original AST as well as from the processed
operations. Moreover, the new AST may be embedded in a context H1.

For instance, the compositional nondeterminism handler is defined as follows,
where F = Nondet, G1 = [] and implicitly H1 = Id.

handleNondet :: Functor g => Free (Nondet + g) a -> Free g [a]
handleNondet = fold (algNondet \/ Con) genNondet

Here the variables are handled by genNondet, the monadified version of the
list-valued generator of the previous handler.

genNondet :: Functor g => a -> Free g [a]
genNondet x = Var [x]

The g nodes are handled by a Con algebra, which essentially leaves them untouched.
The Nondet nodes are handled by the algNondet algebra, which is a
monadified version of the list-valued algebra of the previous handler.

algNondet :: Functor g => Nondet (Free g [a]) -> Free g [a]
algNondet (Or ml1 ml2) =
  do { l1 <- ml1; l2 <- ml2; Var (l1 ++ l2) }

The junction combinator (\/) composes the algebras for the two kinds of nodes.

(\/) :: (f b -> b) -> (g b -> b) -> ((f + g) b -> b)
(\/) algf algg (Inl s) = algf s
(\/) algf algg (Inr s) = algg s

In the definition of handleNondet, we use algNondet \/ Con. Since the functor
in question is Nondet + g, the values constructed by Nondet are handled by
algNondet, and values constructed by g are left untouched: the fold unwraps one
level of Con, but then replaces it with a Con again.

A second example of an effect signature is that of state, whose primitive 
operations Get and Put respectively query and modify the implicit state. 

data State s k where
  Put :: s -> k -> State s k
  Get :: (s -> k) -> State s k
instance Functor (State s) where
  fmap f (Put s k) = Put s (f k)
  fmap f (Get k) = Get (f . k)

The compositional handler for state is as follows, where F = State s, H1 = s -> -
and implicitly G1 = Id.





handleState :: Functor g => Free (State s + g) a -> (s -> Free g a)
handleState = fold (algState \/ conState) genState

This time the variable and constructor cases are defined as:

genState :: Functor g => a -> (s -> Free g a)
genState x s = Var x

algState :: Functor g => State s (s -> Free g a) -> (s -> Free g a)
algState (Put s' k) s = k s'
algState (Get k) s = k s s

Using genState, a variable x is replaced by a version that ignores any incoming
state parameter s. Any stateful constructs are handled by algState, where a
continuation k proceeds by using the appropriate state: if the syntax is a Put s' k,
then the new state is s', otherwise the syntax is Get k, in which case the state is
left unchanged and passed as s.

Finally, any syntax provided by g is adapted by conState to take the extra
state parameter s into account:

conState :: Functor g => g (s -> Free g a) -> (s -> Free g a)
conState op s = Con (fmap (\m -> m s) op)

This feeds the state s to the continuations of the operation. 

To demonstrate effect composition we can put Nondet and State together and
handle them both. Before we do so, we also need a base case for the composition, 
which is the empty signature Void. 

data Void k 
instance Functor Void 

The Void handler only provides a variable case since the signature has no 
constructors. In fact, a Free Void term can only be a Var x, so x is immediately 
output using the identity function. 

handleVoid :: Free Void a -> a
handleVoid = fold undefined id

Finally, we can put together a composite handler for programs that feature both
nondeterminism and state. The signature of such programs is the composition of
the three basic signatures:

type Σ = Nondet + (State Int + Void)

The handler is the composition of the three handlers, working from the left-most
functor in the signature:

handleΣ :: Free Σ a -> Int -> [a]
handleΣ prog = handleVoid . (handleState . handleNondet) prog
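
To see the composite handler at work, here is a self-contained sketch of the three handlers and their composition. Several names are assumptions made for the sake of a compilable example: the coproduct is written :+: and the composite signature Sigma as ASCII stand-ins, the helpers getS, putS and choose and the program example are hypothetical, and the impossible Void cases call error.

```haskell
{-# LANGUAGE TypeOperators #-}

data Free f a = Var a | Con (f (Free f a))

fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

instance Functor f => Functor (Free f) where
  fmap h = fold Con (Var . h)
instance Functor f => Applicative (Free f) where
  pure = Var
  mf <*> mx = fold Con (\h -> fmap h mx) mf
instance Functor f => Monad (Free f) where
  m >>= f = fold Con f m

-- Coproduct of signatures (ASCII stand-in for the paper's +).
data (f :+: g) a = Inl (f a) | Inr (g a)
instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl s) = Inl (fmap h s)
  fmap h (Inr s) = Inr (fmap h s)

-- Junction of two algebras.
(\/) :: (f b -> b) -> (g b -> b) -> ((f :+: g) b -> b)
(\/) algF _    (Inl s) = algF s
(\/) _    algG (Inr s) = algG s

data Nondet k = Or k k
instance Functor Nondet where
  fmap h (Or x y) = Or (h x) (h y)

data State s k = Put s k | Get (s -> k)
instance Functor (State s) where
  fmap h (Put s k) = Put s (h k)
  fmap h (Get k)   = Get (h . k)

data Void k
instance Functor Void where
  fmap _ _ = error "Void has no constructors"

handleNondet :: Functor g => Free (Nondet :+: g) a -> Free g [a]
handleNondet = fold (algNondet \/ Con) genNondet
  where
    genNondet x = Var [x]
    algNondet (Or ml1 ml2) = do { l1 <- ml1; l2 <- ml2; Var (l1 ++ l2) }

handleState :: Functor g => Free (State s :+: g) a -> (s -> Free g a)
handleState = fold (algState \/ conState) genState
  where
    genState x _ = Var x
    algState (Put s' k) _ = k s'
    algState (Get k)    s = k s s
    conState op s = Con (fmap (\m -> m s) op)

handleVoid :: Free Void a -> a
handleVoid = fold (\_ -> error "Void has no constructors") id

type Sigma = Nondet :+: (State Int :+: Void)

handleSigma :: Free Sigma a -> Int -> [a]
handleSigma prog = handleVoid . (handleState . handleNondet) prog

-- Hypothetical smart constructors and a sample program.
getS :: Free Sigma Int
getS = Con (Inr (Inl (Get Var)))

putS :: Int -> Free Sigma ()
putS s = Con (Inr (Inl (Put s (Var ()))))

choose :: Free Sigma a -> Free Sigma a -> Free Sigma a
choose p q = Con (Inl (Or p q))

-- Left branch increments the state and reads it; right branch only reads it.
example :: Free Sigma Int
example = choose (do { s <- getS; putS (s + 1); getS }) getS
```

With this ordering of handlers the state is threaded from the left branch into the right one, so handleSigma example 10 yields [11, 11] rather than [11, 10]: handleNondet sequences the two branches inside the remaining state computation.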





Effect Delegation Another important class of applications is that in which a
handler expresses the complex semantics of particular operations in terms of
more primitive effects.

For instance, the following logging handler for state records every update of
the state by means of the Writer effect.

handleLogState :: Free (State s) a -> s -> Free (Writer String + Void) a
handleLogState = fold algLogState genState

algLogState :: State s (s -> Free (Writer String + Void) a)
            -> s -> Free (Writer String + Void) a
algLogState (Put s' k) s = Con (Inl (Tell "put" (k s')))
algLogState (Get k) s = k s s

The syntax of the Writer effect is captured by the following functor, where w is 
a parameter that represents the type of values that are written to the log: 

data Writer w k where
  Tell :: w -> k -> Writer w k
instance Functor (Writer w) where
  fmap f (Tell w k) = Tell w (f k)

A semantics can be given by the following handler, where w is constrained to be
a member of the Monoid typeclass.

handleWriter :: (Functor g, Monoid w) => Free (Writer w + g) a -> Free g (w, a)
handleWriter = fold (algWriter \/ Con) genWriter

The variables are evaluated by pairing with the unit of the monoid given by
mempty before being embedded into the monad m2.

genWriter :: (Monad m2, Monoid w) => a -> m2 (w, a)
genWriter x = return (mempty, x)

When a Tell w1 k operation is encountered, the continuation k is run and w1 is
then combined, using mappend, with the log that k generates.

algWriter :: (Monad m2, Monoid w) => Writer w (m2 (w, a)) -> m2 (w, a)
algWriter (Tell w1 k) = k >>= \(w2, x) -> return (w1 `mappend` w2, x)

To see this machinery in action, consider the following program that makes use 
of state: 

program :: Int -> Free (State Int) Int
program n
  | n < 0 = Con (Get var)
  | otherwise = Con (Get (\s -> Con (Put (s + n) (program (n - 1)))))

This is then simply evaluated by running handlers in sequence.

example :: Int -> (String, Int)
example n = (handleVoid . handleWriter . handleLogState (program n)) 0

To fully interpret a stateful program, we must first run handleLogState, which
interprets the state operations by generating a tree with Writer String syntax.
This generated syntax is then handled with the handleWriter handler.
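
The delegation pipeline can be tried out with the following self-contained sketch. It restates the definitions above under a few assumptions: the coproduct is written :+: as an ASCII stand-in, the generator and algebras are inlined as local definitions, Var is used where the text writes var, the guard n < 0 is taken as printed above, and the impossible Void cases call error.

```haskell
{-# LANGUAGE TypeOperators #-}

data Free f a = Var a | Con (f (Free f a))

fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

instance Functor f => Functor (Free f) where
  fmap h = fold Con (Var . h)
instance Functor f => Applicative (Free f) where
  pure = Var
  mf <*> mx = fold Con (\h -> fmap h mx) mf
instance Functor f => Monad (Free f) where
  m >>= f = fold Con f m

data (f :+: g) a = Inl (f a) | Inr (g a)
instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl s) = Inl (fmap h s)
  fmap h (Inr s) = Inr (fmap h s)

(\/) :: (f b -> b) -> (g b -> b) -> ((f :+: g) b -> b)
(\/) algF _    (Inl s) = algF s
(\/) _    algG (Inr s) = algG s

data State s k = Put s k | Get (s -> k)
instance Functor (State s) where
  fmap h (Put s k) = Put s (h k)
  fmap h (Get k)   = Get (h . k)

data Writer w k = Tell w k
instance Functor (Writer w) where
  fmap h (Tell w k) = Tell w (h k)

data Void k
instance Functor Void where
  fmap _ _ = error "Void has no constructors"

handleVoid :: Free Void a -> a
handleVoid = fold (\_ -> error "Void has no constructors") id

-- State handler that emits a Tell node for every Put.
handleLogState :: Free (State s) a -> s -> Free (Writer String :+: Void) a
handleLogState = fold algLogState (\x _ -> Var x)
  where
    algLogState (Put s' k) _ = Con (Inl (Tell "put" (k s')))
    algLogState (Get k)    s = k s s

-- Writer handler that accumulates the log in a monoid.
handleWriter :: (Functor g, Monoid w) => Free (Writer w :+: g) a -> Free g (w, a)
handleWriter = fold (algWriter \/ Con) (\x -> return (mempty, x))
  where
    algWriter (Tell w1 k) = k >>= \(w2, x) -> return (w1 `mappend` w2, x)

program :: Int -> Free (State Int) Int
program n
  | n < 0     = Con (Get Var)
  | otherwise = Con (Get (\s -> Con (Put (s + n) (program (n - 1)))))

example :: Int -> (String, Int)
example n = (handleVoid . handleWriter . handleLogState (program n)) 0
```

With the guard as written, program 2 performs three Put operations (for n = 2, 1 and 0), so example 2 evaluates to ("putputput", 3).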

3 Fusion 

The previous composition examples lead us to the main challenge of this paper: 
The composition of two handlers produces an intermediate abstract syntax tree. 
How can we fuse the two handlers into a single one that does not involve an 
intermediate tree? 

More concretely, given two handlers of the form: 

handler1 :: Free F1 a -> H1 (Free F2 (G1 a))
handler1 = fold alg1 gen1
handler2 :: Free F2 a -> H2 a
handler2 = fold alg2 gen2

where F1 and F2 are signature functors and H1, G1 and H2 are arbitrary functors,
our goal is to obtain a combined handler

pipeline12 :: Free F1 a -> H1 (H2 (G1 a))
pipeline12 = fold alg12 gen12

such that

fmap handler2 . handler1 = pipeline12   (3)

3.1 Towards Proper Builders 

The fact that handler1 builds an AST over functor F2 and that handler2 folds
over this AST suggests a particular kind of fusion known as shortcut fusion or
fold/build fusion.

One of the two key ingredients for this kind of fusion is already manifestly
present: the fold in handler2. Yet, the structure and type of handler1 do not
necessarily force it to be a proper builder: fold/build fusion requires that the
builder creates the F2-AST from scratch by generating all the Var and Con
constructors itself. Indeed, in theory, handler1 could produce the F2-AST out of
ready-made components supplied by (fields of) a colluding F1 functor.

In order to force handler1 to be a proper builder, we require it to be implemented
against a builder interface rather than a concrete representation. This
builder interface is captured in the typeclass TermMonad (explained below), and
then, with the following constraint polymorphic signature, handler1 is guaranteed
to build properly:

handler1 :: (TermMonad m2 F2) => Free F1 a -> H1 (m2 (G1 a))
Term Algebras The concept of a term algebra provides an abstract interface
for the primitive ways to build an AST: the two constructors Con and Var of
the free monad. We borrow the nomenclature from the literature on universal
algebras [2].

A term algebra is an f-algebra con :: f (h a) -> h a with a carrier h a. The
values in h a are those generated by the set of variables a with a valuation
function var :: a -> h a, as well as those that arise out of repeated applications
of the algebra. This is modelled by the typeclass TermAlgebra h f as follows:

class Functor f => TermAlgebra h f | h -> f where
  var :: forall a . a -> h a
  con :: forall a . f (h a) -> h a

The function var is used to embed a variable into the term, and the function
con is used to construct a term from existing ones. This typeclass is well-defined
only when h a is indeed generated by var and con.

The most trivial instance of this typeclass is of course that of the free monad.

instance Functor f => TermAlgebra (Free f) f where
  var = Var
  con = Con


Term Monads There are two additional convenient ways to build an AST: the
monadic primitives return and (>>=).

A monad m is a term monad for a functor f if there is a term algebra for
f whose carrier is m a. We can model this relationship as a typeclass with no
members.

class (Monad m, TermAlgebra m f) => TermMonad m f | m -> f
instance (Monad m, TermAlgebra m f) => TermMonad m f

Again, the free monad is the obvious instance of this typeclass.

instance Functor f => TermMonad (Free f) f

Its monadic primitives are implemented in terms of fold, con and var. In the
abstract builder interface TermMonad we only partially expose this fact, by
means of the following two laws. Firstly, the var operation should coincide with
the monad's return.

var = return   (4)

Secondly, the monad's bind (>>=) should distribute through con.

con op >>= k = con (fmap (>>= k) op)   (5)

This law states that a term constructed by an operation, where the term is
followed by a continuation k, is equivalent to constructing a term from an
operation where the operation is followed by k. In other words, the arguments of
an operation correspond to its continuations.
Examples All the compositional handlers of Section 2.2 can be easily expressed
in terms of the more abstract TermMonad interface. For example, the revised
nondeterminism handler looks as follows.

handle'Nondet :: TermMonad m g => Free (Nondet + g) a -> m [a]
handle'Nondet = fold (alg'Nondet \/ con) gen'Nondet
gen'Nondet :: TermMonad m g => a -> m [a]
gen'Nondet x = var [x]
alg'Nondet :: TermMonad m g => Nondet (m [a]) -> m [a]
alg'Nondet (Or ml1 ml2) =
  do { l1 <- ml1; l2 <- ml2; var (l1 ++ l2) }

Notice that not much change has been necessary. We have generalized away from 
Free g into a type m that is constrained by TermMonad m g. 

3.2 Parametricity: Fold/Build Fusion for Free 

Hinze et al. state that we get fold/build fusion for free from the free theorem
of the builder's polymorphic type. Hence, let us consider what the
new type of handler1 buys us.

Theorem 1. Assume that F1, F2, H1, G1 and A are closed types, with F1,
F2 and H1 functors. Given a function h of type forall m . (TermMonad m F2) =>
Free F1 A -> H1 (m (G1 A)), two term monads M1 and M2 and a term monad
morphism α :: forall a . M1 a -> M2 a, then:

fmap α . h_M1 = h_M2   (6)

where the subscripts of h denote the instantiations of the polymorphic type
variable m.

If handler2 is a term monad morphism, then we can use the free theorem to
determine pipeline12 in one step, starting from Equation (3).

  fmap handler2 . handler1 = pipeline12
≡ { Parametricity (6), assuming handler2 is a term monad morphism }
  handler1 = pipeline12

Unfortunately, handler2 is not a term monad morphism for the simple reason
that H2 is just an arbitrary functor that does not necessarily have a monad
structure. Hence, in general H2 is only a term algebra.

instance TermAlgebra H2 F2 where
  var = gen2
  con = alg2

Fortunately, we can turn any term algebra into a term monad, thanks to the 
codensity monad, which is what we explore in the next section. 
3.3 Codensity: TermMonads from TermAlgebras

The Codensity Monad It is well-known that the codensity monad Cod turns any
(endo)functor h into a monad (in fact, h need not even be a functor at all). It
simply instantiates the generalised monoid of endomorphisms in
the category of endofunctors.

newtype Cod h a = Cod { unCod :: forall x . (a -> h x) -> h x }
instance Monad (Cod h) where
  return x = Cod (\k -> k x)
  Cod m >>= f = Cod (\k -> m (\a -> unCod (f a) k))


TermMonad Construction Given any term algebra h for functor f, we have that
Cod h is also a term algebra for f.

instance TermAlgebra h f => TermAlgebra (Cod h) f where
  var = return
  con = algCod con

algCod :: Functor f => (forall x . f (h x) -> h x) -> (f (Cod h a) -> Cod h a)
algCod alg op = Cod (\k -> alg (fmap (\m -> unCod m k) op))

Moreover, Cod h is also a term monad for f, even if h is not.

instance TermAlgebra h f => TermMonad (Cod h) f
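
The construction can be exercised with a carrier that is not a monad at all. The sketch below is an illustration under stated assumptions: Leaves, coinC, twoCoins and countLeaves are hypothetical names, the Functor and Applicative instances for Cod are added because current GHC versions require them, and the pair of constraints Monad plus TermAlgebra plays the role of TermMonad. It also assumes the listed language extensions are acceptable.

```haskell
{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

class Functor f => TermAlgebra h f | h -> f where
  var :: a -> h a
  con :: f (h a) -> h a

newtype Cod h a = Cod { unCod :: forall x . (a -> h x) -> h x }

-- No constraint on h is needed for the monad structure.
instance Functor (Cod h) where
  fmap f m = Cod (\k -> unCod m (k . f))
instance Applicative (Cod h) where
  pure x = Cod (\k -> k x)
  mf <*> mx = Cod (\k -> unCod mf (\f -> unCod mx (k . f)))
instance Monad (Cod h) where
  m >>= f = Cod (\k -> unCod m (\a -> unCod (f a) k))

algCod :: Functor f => (forall x . f (h x) -> h x) -> (f (Cod h a) -> Cod h a)
algCod alg op = Cod (\k -> alg (fmap (\m -> unCod m k) op))

-- Any term algebra h lifts to a term algebra on Cod h.
instance TermAlgebra h f => TermAlgebra (Cod h) f where
  var = pure
  con = algCod con

runCod :: (a -> f x) -> Cod f a -> f x
runCod g m = unCod m g

data Nondet k = Or k k
instance Functor Nondet where
  fmap f (Or x y) = Or (f x) (f y)

-- Hypothetical carrier that only counts leaves; it has no monad structure.
newtype Leaves a = Leaves Int

instance TermAlgebra Leaves Nondet where
  var _ = Leaves 1
  con (Or (Leaves m) (Leaves n)) = Leaves (m + n)

-- Programs are built monadically in Cod Leaves.
coinC :: Cod Leaves Bool
coinC = con (Or (var True) (var False))

twoCoins :: Cod Leaves (Bool, Bool)
twoCoins = do { x <- coinC; y <- coinC; return (x, y) }

countLeaves :: Cod Leaves a -> Int
countLeaves m = case runCod var m of Leaves n -> n
```

Here countLeaves twoCoins evaluates to 4, although Leaves itself supports neither return nor bind; the monadic structure comes entirely from Cod.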

The definition of var makes it easy to see that it satisfies the first term monad
law in Equation (4). The proof for the second law, Equation (5), is less obvious:

  con op >>= f
= { unfold (>>=) }
  Cod (\k -> unCod (con op) (\a -> unCod (f a) k))
= { unfold con }
  Cod (\k -> unCod (algCod con op) (\a -> unCod (f a) k))
= { unfold algCod }
  Cod (\k -> unCod (Cod (\k -> con (fmap (\m -> unCod m k) op))) (\a -> unCod (f a) k))
= { apply unCod . Cod = id }
  Cod (\k -> (\k -> con (fmap (\m -> unCod m k) op)) (\a -> unCod (f a) k))
= { apply β-reduction }
  Cod (\k -> con (fmap (\m -> unCod m (\a -> unCod (f a) k)) op))
= { apply β-expansion }
  Cod (\k -> con (fmap (\m -> (\k -> unCod m (\a -> unCod (f a) k)) k) op))
= { apply unCod . Cod = id }
  Cod (\k -> con (fmap (\m -> unCod (Cod (\k -> unCod m (\a -> unCod (f a) k))) k) op))
= { fold (>>=) }
  Cod (\k -> con (fmap (\m -> unCod (m >>= f) k) op))
= { apply β-expansion }
  Cod (\k -> con (fmap (\m -> (\m -> unCod m k) (m >>= f)) op))
= { apply β-expansion }
  Cod (\k -> con (fmap (\m -> (\m -> unCod m k) ((\m -> m >>= f) m)) op))
= { fold (.) }
  Cod (\k -> con (fmap (\m -> ((\m -> unCod m k) . (\m -> m >>= f)) m) op))
= { apply η-reduction }
  Cod (\k -> con (fmap ((\m -> unCod m k) . (\m -> m >>= f)) op))
= { apply fmap-fission and unfold (.) }
  Cod (\k -> con (fmap (\m -> unCod m k) (fmap (\m -> m >>= f) op)))
= { fold algCod }
  algCod con (fmap (\m -> m >>= f) op)
= { fold con and η-reduce }
  con (fmap (>>= f) op)


3.4 Shifting to Codensity 

Now we can write handler2 as the composition of a term monad morphism
handler'2 with a post-processing function runCod gen2.

handler2 :: Free F2 a -> H2 a
handler2 = runCod gen2 . handler'2
handler'2 :: Free F2 a -> Cod H2 a
handler'2 = fold (algCod alg2) var
runCod :: (a -> f x) -> Cod f a -> f x
runCod g m = unCod m g

This decomposition of handler2 hinges on the following property:

fold alg2 gen2 = runCod gen2 . fold (algCod alg2) var   (7)

This equation follows from the second fusion law for folds (2), provided that:

gen2 = runCod gen2 . var
runCod gen2 . algCod alg2 = alg2 . fmap (runCod gen2)

The former holds as follows:

  runCod gen2 . var
= { unfold (.) }
  \x -> runCod gen2 (var x)
= { unfold runCod and var }
  \x -> unCod (Cod (\k -> k x)) gen2
= { apply unCod . Cod = id }
  \x -> (\k -> k x) gen2
= { β-reduction }
  \x -> gen2 x
= { η-reduction }
  gen2

and the latter:

  runCod gen2 . algCod alg2
= { unfold (.) }
  \op -> runCod gen2 (algCod alg2 op)
= { unfold runCod and algCod }
  \op -> unCod (Cod (\k -> alg2 (fmap (\m -> unCod m k) op))) gen2
= { apply unCod . Cod = id }
  \op -> (\k -> alg2 (fmap (\m -> unCod m k) op)) gen2
= { β-reduction }
  \op -> alg2 (fmap (\m -> unCod m gen2) op)
= { fold runCod }
  \op -> alg2 (fmap (\m -> runCod gen2 m) op)
= { η-reduction }
  \op -> alg2 (fmap (runCod gen2) op)
= { fold (.) }
  alg2 . fmap (runCod gen2)


3.5 Fusion at Last 

Finally, instead of fusing fmap handler2 . handler1 we can fuse fmap handler'2 .
handler1 using the free theorem. This yields:

pipeline'12 = fold alg1 gen1

Now we can calculate the original fusion:

  fmap handler2 . handler1
= { decomposition of handler2 }
  fmap (runCod gen2 . handler'2) . handler1
= { fmap fission }
  fmap (runCod gen2) . fmap handler'2 . handler1
= { free theorem }
  fmap (runCod gen2) . handler1

In other words, the fused version can be defined as:

pipeline12 = fmap (runCod gen2) . fold alg1 gen1

Observe that this version only performs a single fold and does not allocate any 
intermediate tree. 
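
The following sketch compares an unfused and a fused pipeline on one concrete instance, with handler1 the term-monad-polymorphic nondeterminism handler and handler2 a state handler (here H1 = Id, so the outer fmap disappears). It is an illustration under stated assumptions, not the paper's implementation: :+: stands in for the coproduct, the pair of constraints Monad plus TermAlgebra stands in for TermMonad, and StateFn, handleState2, unfused, fused, getS, putS and example are hypothetical names.

```haskell
{-# LANGUAGE RankNTypes, TypeOperators, MultiParamTypeClasses,
             FunctionalDependencies, FlexibleInstances,
             UndecidableInstances #-}

data Free f a = Var a | Con (f (Free f a))

fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

instance Functor f => Functor (Free f) where
  fmap h = fold Con (Var . h)
instance Functor f => Applicative (Free f) where
  pure = Var
  mf <*> mx = fold Con (\h -> fmap h mx) mf
instance Functor f => Monad (Free f) where
  m >>= f = fold Con f m

data (f :+: g) a = Inl (f a) | Inr (g a)
instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl s) = Inl (fmap h s)
  fmap h (Inr s) = Inr (fmap h s)

(\/) :: (f b -> b) -> (g b -> b) -> ((f :+: g) b -> b)
(\/) algF _    (Inl s) = algF s
(\/) _    algG (Inr s) = algG s

data Nondet k = Or k k
instance Functor Nondet where
  fmap h (Or x y) = Or (h x) (h y)

data State s k = Put s k | Get (s -> k)
instance Functor (State s) where
  fmap h (Put s k) = Put s (h k)
  fmap h (Get k)   = Get (h . k)

-- Builder interface and its two instances.
class Functor f => TermAlgebra h f | h -> f where
  var :: a -> h a
  con :: f (h a) -> h a

instance Functor f => TermAlgebra (Free f) f where
  var = Var
  con = Con

newtype Cod h a = Cod { unCod :: forall x . (a -> h x) -> h x }
instance Functor (Cod h) where
  fmap f m = Cod (\k -> unCod m (k . f))
instance Applicative (Cod h) where
  pure x = Cod (\k -> k x)
  mf <*> mx = Cod (\k -> unCod mf (\f -> unCod mx (k . f)))
instance Monad (Cod h) where
  m >>= f = Cod (\k -> unCod m (\a -> unCod (f a) k))
instance TermAlgebra h f => TermAlgebra (Cod h) f where
  var = pure
  con op = Cod (\k -> con (fmap (\m -> unCod m k) op))

runCod :: (a -> f x) -> Cod f a -> f x
runCod g m = unCod m g

-- handler1: builds its output only through var, con and the monad.
handleNondet' :: (Monad m, TermAlgebra m g) => Free (Nondet :+: g) a -> m [a]
handleNondet' = fold (alg \/ con) (\x -> var [x])
  where alg (Or ml1 ml2) = do { l1 <- ml1; l2 <- ml2; var (l1 ++ l2) }

-- handler2: carrier and algebra of a simple state handler.
newtype StateFn a = StateFn { runStateFn :: Int -> a }
instance TermAlgebra StateFn (State Int) where
  var x = StateFn (\_ -> x)
  con (Put s' k) = StateFn (\_ -> runStateFn k s')
  con (Get k)    = StateFn (\s -> runStateFn (k s) s)

handleState2 :: Free (State Int) a -> StateFn a
handleState2 = fold con var

unfused, fused :: Free (Nondet :+: State Int) a -> StateFn [a]
unfused = handleState2 . handleNondet'  -- m = Free (State Int): builds a tree
fused   = runCod var . handleNondet'    -- m = Cod StateFn: no tree is built

getS :: Free (Nondet :+: State Int) Int
getS = Con (Inr (Get Var))

putS :: Int -> Free (Nondet :+: State Int) ()
putS s = Con (Inr (Put s (Var ())))

example :: Free (Nondet :+: State Int) Int
example = Con (Inl (Or (do { s <- getS; putS (s + 1); getS }) getS))
```

Both runStateFn (unfused example) 10 and runStateFn (fused example) 10 evaluate to [11, 11]; only the unfused version goes through an intermediate Free (State Int) tree.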
3.6 Repeated Fusion

Often a sequence of handlers is not restricted to two. Fortunately, we can easily
generalize the above to a pipeline of n handlers

fmap^n handler_n . ... . fmap handler1 . handler0

where handler_i = fold alg_i gen_i (i ∈ 1..n). This pipeline fusion can start by
arbitrarily fusing two consecutive handlers handler_(j+1) and handler_j using the
above approach, and then incrementally extending the fused kernel on the left and
the right with additional handlers. These two kinds of extensions are explained
below.


Fusion on the Right Suppose that fmap handler2 . handler1 is composed
with another handler on the right:

handler0 :: (TermMonad m1 F1) => Free F0 a -> H0 (m1 (G0 a))
handler0 = fold alg0 gen0

to form the pipeline:

pipeline012 = fmap (fmap handler2 . handler1) . handler0

Can we perform the fusion twice to obtain a single fold and eliminate both
intermediate trees? Yes! The first fusion, as before, yields:

pipeline012 = fmap (fmap (runCod gen2) . handler1) . handler0

Applying fmap fission and regrouping, we obtain:

pipeline012 = fmap (fmap (runCod gen2)) . (fmap handler1 . handler0)

Now the right component is another instance of the binary fusion problem, which
yields:

pipeline012 = fmap (fmap (runCod gen2)) . fmap (runCod gen1) . handler0


Fusion on the Left Suppose that handler2 has the more specialised type:

handler2 :: (TermMonad m3 F3) => Free F2 a -> H2 (m3 (G2 a))

then we can compose fmap handler2 . handler1 on the left with another handler:

handler3 :: Free F3 a -> H3 a
handler3 = fold alg3 gen3

This yields a slightly more complicated fusion scenario:

pipeline123 = fmap (fmap handler3 . handler2) . handler1

Of course, we can first fuse handler2 and handler3. That would yield an instance
of fusion on the right. However, suppose we first fuse handler1 and handler2,
after applying fmap fission.

pipeline123 = fmap (fmap handler3) . fmap (runCod gen2) . handler1

Now we can shift the carrier of handler3 to codensity and invoke the free theorem
on fmap (runCod gen2) . handler1. This accomplishes the second fusion.

pipeline123 = fmap (fmap (runCod gen3)) . fmap (runCod gen2) . handler1


Summary An arbitrary pipeline of the form:

fmap^n handler_n . ... . fmap handler1 . handler0

where handler_i = fold alg_i gen_i (i ∈ 1..n) fuses into

fmap^n (runCod gen_n) . ... . fmap (runCod gen1) . handler0


3.7 Fusion all the Way 

We are not restricted to fusing handlers, but can fuse all the way, up to and 
including the expression that builds the initial AST and to which the handlers are 
applied. Consider for example the coin example of Section 2.1. The free theorem
of coin's type is a variant of Theorem 1:

α coin = coin

where α :: forall a . M1 a -> M2 a is a term monad morphism between any two
term monads M1 and M2. We can use this to fuse handleNondet coin into
runCod genNondet coin. Of course this fusion interacts nicely with the fusion of
a pipeline of handlers. 

4 Pragmatic Implementation and Evaluation 

This section turns the fusion approach into a practical Haskell implementation 
and evaluates the performance improvement. 


4.1 Pragmatic Implementation 

Before we can put the fusion approach into practice, we need to consider a few 
pragmatic implementation aspects. 
Inlining with Typeclasses In a lazy language like Haskell, fusion only leads
to a significant performance gain if it is performed statically by the compiler 
and combined with inlining. In the context of the GHC compiler, the inlining 
requirement leaves little implementation freedom: GHC is rather reluctant to 
inline in general recursive code. There is only one exception: GHC is keen to 
create type-specialised copies of (constraint) polymorphic recursive definitions 
and to inline the definitions of typeclass methods in the process. 

In short, if we wish to get good speed-ups from effect handler fusion, we need 
to make sure that the effectful programs are polymorphic in the term monad and 
that all the algebras are held in typeclass instances. For this reason, all handlers 
should be made instances of TermAlgebra. 

Explicit Carrier Functors The carrier functor of the compositional state handler
is s -> m2 -. From the category theory point of view, this is clearly a functor.
However, it is neither an instance of the Haskell Functor typeclass nor can it be
made one in this syntactically invalid form. A new type needs to be created that
instantiates the functor typeclass:

newtype StateCarrier s m a = SC { unSC :: s -> m a }
instance Functor m => Functor (StateCarrier s m) where
  fmap f x = SC (fmap (fmap f) (unSC x))
instance TermMonad m2 f =>
    TermAlgebra (StateCarrier s m2) (State s + f) where
  con = SC . (alg'State \/ conState) . fmap unSC
  var = SC . gen'State

gen'State :: TermMonad m f => a -> (s -> m a)
gen'State x = const (var x)

alg'State :: TermMonad m f => State s (s -> m a) -> (s -> m a)
alg'State (Put s' k) s = k s'
alg'State (Get k) s = k s s

Now the following function is convenient to run a fused state handler.

runStateC :: TermMonad m2 f => Cod (StateCarrier s m2) a -> (s -> m2 a)
runStateC = unSC . runCod var


Unique Carrier Functors Even though the logging state handler has ostensibly
the same carrier s -> m2 - as the regular compositional state handler (only the
typeclass constraints on m2 are different), we cannot
reuse the same functor. The reason is that in the typeclass-based approach the
carrier functor must uniquely determine the algebra; two different typeclass
instances for the same type are forbidden. Hence, we need to write a new set of
definitions as follows:

newtype LogStateCarrier s m a = LSC { unLSC :: s -> m a }
instance Functor m => Functor (LogStateCarrier s m) where
  fmap f x = LSC (fmap (fmap f) (unLSC x))
instance TermMonad m2 (Writer String + Void)
    => TermAlgebra (LogStateCarrier s m2) (State s) where
  var = LSC . gen'State
  con = LSC . algLogState . fmap unLSC
runLogStateC :: TermMonad m2 (Writer String + Void)
    => Cod (LogStateCarrier s m2) a -> (s -> m2 a)
runLogStateC = unLSC . runCod var


4.2 Evaluation 

To evaluate the impact of fusion we consider several benchmarks implemented in 
different ways: running handlers over the inductive definition of the free monad 
(Free) and over its Church encoding (Church), the fully fused definitions (Fused), 
and the conventional definitions of the state monad from MTL (MTL). 

The benchmarks are run in the Criterion benchmarking harness using GHC 
7.10.1 on a MacBook Pro with a 3 GHz Intel Core i7 processor, 16 GB RAM 
and Mac OS 10.10.3. All values are in milliseconds, and show the ordinary 
least-squares regression of the recorded samples; the R² goodness of fit is 
above 0.99 in all instances. 


Benchmark            Free     Church      Fused        MTL 

count1   10^7       1,017      1,311          3          3 
         10^8      10,250     13,220         29         29 
         10^9     103,000    129,700        291        295 

count2   10^6         684        746        167        213 
         10^7       6,937      7,344      1,740      2,157 
         10^8     102,700     98,010     17,220     20,300 

count3   10^6         559        555        166        205 
         10^7       5,759      5,561      1,618      2,132 
         10^8     110,900     94,760     16,300     20,120 

grammar               794        763          6         77 

pipes               1,325      1,351         43        N/A 


The count1 benchmark consists of the simple count-down loop used by 
Kammar et al. [9]. 

count1 = do i <- get 
            if i == 0 then return i 
                      else do put (i - 1); count1 

We have evaluated this program with three different initial states of 10^7, 10^8 and 
10^9. The results show that all representations scale linearly with the size of the 
benchmark. However, the fused representation is about 300 times faster than 
the free monad representations and matches the performance of the traditional 
monad transformers. 
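
For contrast, here is a minimal sketch of what the unfused Free baseline might look like for this loop. It is an assumption-laden illustration, not the benchmark code itself: the names Free, Var, Con, fold and handleState are chosen for this sketch, and only the State signature is modelled. The program is first built as a syntax tree and the handler then folds an algebra over it; the construction and traversal of this tree is the overhead that fusion removes.

```haskell
-- Inductive free monad over a signature functor f.
data Free f a = Var a | Con (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Var a)  = Var (g a)
  fmap g (Con op) = Con (fmap (fmap g) op)

instance Functor f => Applicative (Free f) where
  pure = Var
  mf <*> mx = mf >>= \g -> fmap g mx

instance Functor f => Monad (Free f) where
  Var a  >>= k = k a
  Con op >>= k = Con (fmap (>>= k) op)   -- walks the tree built so far

-- Signature of the state effect.
data State s k = Put s k | Get (s -> k)

instance Functor (State s) where
  fmap g (Put s k) = Put s (g k)
  fmap g (Get k)   = Get (g . k)

get :: Free (State s) s
get = Con (Get Var)

put :: s -> Free (State s) ()
put s = Con (Put s (Var ()))

-- The count-down loop, fixed to the free monad representation.
count1 :: Free (State Int) Int
count1 = do
  i <- get
  if i == 0 then return i else put (i - 1) >> count1

-- A handler is a fold: an algebra for Con nodes, a generator for Var leaves.
fold :: Functor f => (f b -> b) -> (a -> b) -> Free f a -> b
fold _   gen (Var x)  = gen x
fold alg gen (Con op) = alg (fmap (fold alg gen) op)

handleState :: Free (State s) a -> s -> a
handleState = fold alg const
  where
    alg (Put s' k) _ = k s'
    alg (Get k)    s = k s s
```

Evaluating handleState count1 n runs the loop from initial state n and returns the final counter value.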

The count2 benchmark extends the count1 benchmark with a tell operation 
from the Writer effect in every iteration of the loop. It is run by sequencing 
the state and writer handlers. The improvement due to fusion is now much less 
extreme, but still quite significant. 

The count3 benchmark is the count1 program, but run with the logging state 
handler that delegates to the writer handler. The runtimes are slightly better 
than those of count2. 

The grammar benchmark implements a simple parser by layering the state 
and non-determinism effects. Again fusion has a tremendous impact, even 
considerably outperforming the MTL implementation. 

The pipes benchmark consists of the simple producer-consumer pipe used by 
Kammar et al. [9]. We can see that fusion provides a significant improvement 
over either free monad representation. There is no sensible MTL implementation 
to compare with for this benchmark. 

The results (in ms) show that the naive approaches based on intermediate 
trees, either defined inductively or by Church encoding, incur a considerable 
overhead compared to traditional monads and monad transformers. Yet, thanks 
to fusion they can easily compete with or even slightly outperform the latter. 

5 Related Work 

5.1 Fusion 

Fusion has received much attention, and was first considered as the elimination 
of trees by Wadler [23] with his so-called deforestation algorithm, which was 
later generalized by Chin [3]. 

From the implementation perspective, Gill et al. first introduced the notion 
of shortcut fusion to Haskell [3], thus allowing programs written as folds over 
lists to be fused. Takano and Meijer showed how fusion could be generalized to 
arbitrary data structures [?]. The technique of using free theorems to explain 
certain kinds of fusion was applied by Voigtländer [20]. 

Work by Hinze et al. [5] builds on recursive coalgebras to show the theory 
and practice of fusion, limited to the case of folds and unfolds. Later work by 
Harper [5] provides a pragmatic discussion that bridges the gap between theory 
and practice further, by discussing the implementation of fusion in GHC with 
inline rules. Harper also considers the fusion of Church encodings. 

More recently, recursive coalgebras appear in work by Hinze et al. [7] where 
conjugate hylomorphisms are introduced as a means of unifying all presently 
known structured recursion schemes. There, the theory behind fusion for general 
datatypes across all known schemes is described as arising from naturality laws 
of adjunctions. Special attention is drawn to fusion of the cofree comonad, which 
is the dual case of free monads we consider here. 


5.2 Effect Handlers 

Plotkin and Power were the first to explore effect operations [?], and gave an 
algebraic account of effects [13] and their combination [8]. Subsequently, Plotkin 
and Pretnar [?] have added the concept of handlers to deal with exceptions. 
This has led to many implementation approaches. 

Lazy Languages Many implementations of effect handlers focus on the lazy 
language Haskell. 

For the sake of simplicity and without regard for efficiency, Wu et al. [21] 
use the inductive datatype representation of the free monad for their exposition. 
They use the Data Types à la Carte approach [?] to conveniently inject functors 
into a co-product; that approach is entirely compatible with this paper. Wu et 
al. also generalize the free monad to higher-order functors; we expect that our 
fusion approach generalizes to that setting. 

Kiselyov et al. [?] provide a Haskell implementation in terms of the free 
monad too. However, they combine this representation with two optimizations: 1) 
the codensity monad improves the performance of (>>=), and 2) their Dynamic-
based open unions have a better time complexity than nested co-products. Due 
to the use of the codensity monad, this paper also benefits from the former 
improvement. Moreover, we believe that the latter improvement is unnecessary 
due to the specialisation and inlining opportunities that are exposed by fusion. 
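
The effect of the codensity monad on (>>=) can be sketched in a few lines. The following is a minimal illustration under assumptions: Cod is defined as a continuation-passing wrapper over an arbitrary carrier h, the helper names liftCod, lowerCod and leftNested are hypothetical, and Maybe merely stands in for the underlying monad. Binding in Cod only composes continuations, so a left-nested chain of binds is reassociated to the right before the underlying monad's own bind is ever used; the asymptotic gain appears when the underlying monad is a free monad, whose bind traverses the tree built so far.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Codensity monad: a computation is represented by what it does
-- to its continuation.
newtype Cod h a = Cod { unCod :: forall x. (a -> h x) -> h x }

instance Functor (Cod h) where
  fmap f (Cod m) = Cod (\k -> m (k . f))

instance Applicative (Cod h) where
  pure x = Cod (\k -> k x)
  Cod mf <*> Cod mx = Cod (\k -> mf (\f -> mx (k . f)))

instance Monad (Cod h) where
  -- No traversal of m here: the rest of the program is pushed
  -- into the continuation.
  Cod m >>= f = Cod (\k -> m (\a -> unCod (f a) k))

-- Embed a computation of the underlying monad (hypothetical helper).
liftCod :: Monad m => m a -> Cod m a
liftCod m = Cod (m >>=)

-- Run by supplying the final continuation (hypothetical helper).
lowerCod :: Monad m => Cod m a -> m a
lowerCod m = unCod m return

-- A left-nested chain of n binds, polymorphic in the monad.
leftNested :: Monad m => Int -> m Int
leftNested n =
  foldl (\acc _ -> acc >>= \i -> return (i + 1)) (return 0) [1 .. n]
```

Because leftNested is polymorphic in its monad, it can be run either directly in the underlying monad or at type Cod Maybe Int followed by lowerCod, and both give the same answer.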

Van der Ploeg and Kiselyov [?] present an implementation of the free monad 
with good asymptotic complexity for both pattern matching and binding; 
unfortunately, the constant factors involved are rather high. This representation is 
mainly useful for effect handlers that cannot be easily expressed as folds, and 
thus fall outside of the scope of the current paper. 

Behind a Template Haskell frontend, Kammar et al. [9] consider a range of 
different Haskell implementations and perform a performance comparison. Their 
basic representation is the inductive datatype definition, with a minor twist: the 
functor is split into syntax for the operation itself and a separate continuation. 
This representation is improved with the codensity monad. Finally, they provide— 
without explanation—a representation that is very close to the one presented 
here; their use of the continuation monad instead of the codensity monad is a 
minor difference. 

Both Atkey et al. [?] and Schrijvers et al. [17] study the interleaving of a free 
monad with an arbitrary monad, i.e., the combination of algebraic effect handlers 
and conventional monadic effects. We believe that our fusion technique can be 
adapted for optimizing the free monad aspect of their settings. 


Strict Languages In the absence of lazy evaluation, the inductive datatype 
definition of the free monad is not practical. 

Kammar et al. [9] briefly sketch an implementation based on delimited 
continuations. Schrijvers et al. [?] show the equivalence between a delimited 
continuations approach and the inductive datatype; hence the fusion technique presented 
in this paper is in principle possible. However, in practice, the codensity monad 
used for fusion is likely not efficient in strict languages. Hence, effective fusion 
for strict languages remains to be investigated. 


5.3 Monad Transformers 

Monad transformers, as first introduced by Liang et al. [?], pre-date algebraic 
effect handlers as a means for modelling compositional effects. Yet, there exists a 
close connection between both approaches: monad transformers are fused forms 
of effect handlers. What is particular about their underlying effect handlers is 
that their carriers are term algebras with a monadic structure, i.e., term monads. 
This means that the Cod construction is not necessary for fusion. 

6 Conclusion 

We have explained how to fuse algebraic effect handlers by shifting perspective 
from free monads to term monads. Our benchmarks show that, with a careful 
code setup in Haskell, this leads to good speed-ups compared to the free monad, 
and allows algebraic effect handlers to compete with the traditional monad 
transformers. 